A Hybrid Scavenger Grid Approach to Intranet Search
Abstract
According to a 2007 global survey of 178 organisational intranets, 3 out of 5 organisations are not satisfied with their intranet search services. However, as intranet data collections become large, effective full-text intranet search services are needed more than ever before. To provide an effective full-text search service based on current information retrieval algorithms, organisations have to deal with the need for greater computational power. Hardware architectures that can scale to large data collections and can be obtained and maintained at a reasonable cost are needed. Web search engines address scalability and cost-effectiveness by using large-scale centralised cluster architectures. The scalability of cluster architectures is evident in the ability of Web search engines to respond to millions of queries within a few seconds while searching very large data collections. Though more cost-effective than high-end supercomputers, cluster architectures still have relatively high acquisition and maintenance costs. Where information retrieval is not the core business of an organisation, a cluster-based approach may not be economically viable. A hybrid scavenger grid is proposed as an alternative architecture. It consists of a combination of dedicated resources and dynamic resources in the form of idle desktop workstations. From the dedicated resources, the architecture gets predictability and reliability, whereas from the dynamic resources it gets scalability. An experimental search engine was deployed on a hybrid scavenger grid and evaluated. Test results showed that the resources of the grid can be organised to deliver the best performance by using the optimal number of machines and scheduling the optimal combination of tasks that the machines perform.
A system-efficiency and cost-effectiveness comparison of a grid and a multi-core machine showed that for workloads of modest to large sizes, the grid architecture delivers better throughput per unit cost than the multi-core machine, at a system efficiency comparable to that of the multi-core machine. The study has shown that a hybrid scavenger grid is a feasible search engine architecture that is cost-effective and scales to medium- to large-scale data collections.
Similar resources
A Hybrid Distributed Architecture for Indexing
This paper presents a hybrid scavenger grid as an underlying hardware architecture for search services within digital libraries. The hybrid scavenger grid consists of both dedicated servers and dynamic resources in the form of idle workstations to handle medium- to large-scale search engine workloads. The dedicated resources are expected to have reliable and predictable behaviour. The dynamic res...
A Scavenger Grid for Intranet Indexing
Digital library services, such as searching and browsing, are increasingly needed in more restricted environments than the public Web. This paper proposes a scavenger Grid of idle desktop workstations to support computationally-intensive indexing services. A prototype software system was developed using commodity Grid middleware and information retrieval tools. This system demonstrated that the...
HYBRID PARTICLE SWARM OPTIMIZATION, GRID SEARCH METHOD AND UNIVARIATE METHOD TO OPTIMALLY DESIGN STEEL FRAME STRUCTURES
This paper combines particle swarm optimization, the grid search method and the univariate method into a general optimization approach for any type of problem, with emphasis on the optimum design of steel frame structures. The new algorithm is denoted GSU-PSO. This method attempts to decrease the search space and searches only the space near the optimum point. To achieve this aim, the whole search space...
Modified Harmony Search Algorithm Based Unit Commitment with Plug-in Hybrid Electric Vehicles
Plug-in Hybrid Electric Vehicle (PHEV) technology has attracted great interest in the recent scientific literature. Vehicle-to-grid (V2G) is an interconnection of the energy storage of PHEVs and the grid. By implementing V2G, the dependence of the power system on small, expensive conventional units can be reduced, resulting in reduced operational cost. This paper presents an intelligent unit commitment (UC) wi...
Optimization of grid independent diesel-based hybrid system for power generation using improved particle swarm optimization algorithm
Supplying power to remote sites and applications at minimal cost and with low emissions is an important issue when discussing future energy concepts. This paper presents the modeling and optimization of a photovoltaic (PV)/wind/diesel system with battery storage for the electrification of an off-grid remote area located in Rafsanjan, Iran. For this location, different hybrid systems are studied and ...